Shotgun Sequence Assembly

Author

  • Mihai Pop

Abstract

Shotgun sequencing is the most widely used technique for determining the DNA sequence of organisms. It involves breaking up the DNA into many small pieces that can be read by automated sequencing machines, then piecing together the original genome using specialized software programs called assemblers. Due to the large amounts of data being generated and to the complex structure of most organisms’ genomes, successful assembly programs rely on sophisticated algorithms based on knowledge from such diverse fields as statistics, graph theory, computer science, and computer engineering. Throughout this chapter we will describe the main computational challenges imposed by the shotgun sequencing method, and survey the most widely used assembly algorithms.
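
The two steps described in the abstract (fragmenting the DNA, then reconstructing it from overlapping pieces) can be illustrated with a toy sketch. This is not an algorithm taken from the chapter: it is a minimal greedy overlap-merge heuristic, and the function names `shear`, `overlap`, and `greedy_assemble`, the `min_len` threshold, and the example sequence are all assumptions made for illustration. It also assumes error-free reads from a single strand.

```python
import random


def shear(genome: str, read_len: int, n_reads: int, seed: int = 0) -> list[str]:
    """Toy 'shotgun' step: sample fixed-length reads at random positions."""
    rng = random.Random(seed)
    last_start = len(genome) - read_len
    starts = (rng.randint(0, last_start) for _ in range(n_reads))
    return [genome[s:s + read_len] for s in starts]


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    # Drop exact duplicates and reads fully contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining pair overlaps by min_len or more
        merged = contigs[best_i] + contigs[best_j][best_k:]
        rest = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        # Discard any sequence that the new contig now fully contains.
        contigs = [c for c in rest if c not in merged] + [merged]


# Hand-picked overlapping reads from a short, repeat-free example sequence.
reads = ["ATGGCGTG", "GCGTGCAAT", "GCAATCCGTA"]
print(greedy_assemble(reads))
```

The pairwise comparison is quadratic in the number of reads, and a greedy merge can join the wrong copies of a repeated region, which is one reason practical assemblers need more sophisticated algorithms than this sketch.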

Related resources

A New Algorithm for DNA Sequence Assembly

Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given ...

A probabilistic approach to sequence assembly validation

Sequence assembly is an essential requirement for determining the complete sequence of long DNA. However, sequence assembly programs often generate misassembled contigs by either joining different repeat copies, resulting in joining noncontiguous DNA regions (inverted or swapped) or by including many fragments from different repeat copies resulting in errors in the consensus sequence (n...

Sequence determination from overlapping fragments: a simple model of whole-genome shotgun sequencing.

Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general pro...

Whole Genome Assemblies of the Drosophila and Human Genomes

Shotgun sequence assembly is a classic inverse problem: given a set of segments randomly sampled from a target sequence, the problem is to reconstruct the target. Early programs for this problem assisted a user by finding potential overlapping segments which were then assembled by hand. As the programs became progressively more sophisticated the problem was completely solved by the software but...

Sequencing and Assembly of the 22-Gb Loblolly Pine Genome

Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun...

Journal title:
  • Advances in Computers

Volume 60  Issue 

Pages  -

Publication date 2004